Setswana Tokenisation and Computational Verb Morphology: Facing the Challenge of a Disjunctive Orthography
Abstract
Prefixes of the Setswana verb
• The subject agreement morphemes, written disjunctively, include non-consecutive subject agreement morphemes and consecutive subject agreement morphemes. For example, the non-consecutive subject agreement morpheme for class 5 is le, as in lekau le a tshega (the young man is laughing), while the consecutive subject agreement morpheme for class 5 is la, as in lekau la tshega (the young man then laughed).
• The object agreement morpheme is written disjunctively in most instances, for example ba di bona (they see it).
• The reflexive morpheme i (-self) is always written conjunctively to the root, for example o ipona (he sees himself).
• The aspectual morphemes are written disjunctively and include the present tense morpheme a, the progressive morpheme sa (still) and the potential morpheme ka (can). Examples are o a araba (he answers), ba sa ithuta (they are still learning) and ba ka ithuta (they can learn).
• The temporal morpheme tla (indicating the future tense) is written disjunctively, for example ba tla ithuta (they shall learn).
• The negative morphemes ga, sa and se are written disjunctively. Examples are ga ba ithute (they do not learn), re sa mo thuse (we do not help him) and o se mo rome (do not send him).
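The effect of the disjunctive orthography on tokenisation can be illustrated with a minimal Python sketch. It is not the method of the paper: it is a naive greedy lookup that merges a run of disjunctively written prefixes with the word that follows. The name group_verbal_tokens and the prefix inventory are assumptions made for illustration. The inventory contains only forms quoted in the examples above, and the sketch ignores ambiguity (a form such as le or a may have other, non-verbal uses that a real tokeniser would have to disambiguate).

```python
# Hypothetical inventory: only the disjunctively written forms quoted in the
# abstract's examples. A real system would need the full concord tables.
VERBAL_PREFIXES = {
    "le", "la", "o", "ba", "re",  # subject agreement (from the examples only)
    "di", "mo",                   # object agreement (from the examples only)
    "a", "sa", "ka",              # aspectual: present, progressive, potential
    "tla",                        # temporal (future)
    "ga", "se",                   # negative ("sa" is already listed above)
}


def group_verbal_tokens(sentence):
    """Greedily merge each run of known prefixes with the word after it.

    Returns a list of tokens in which a verbal segment such as
    "le a tshega" is a single token, spaces included.
    """
    tokens = []
    run = []  # prefixes seen since the last completed token
    for word in sentence.split():
        if word in VERBAL_PREFIXES:
            run.append(word)
        elif run:
            # First non-prefix word after a run: treat it as the verb stem.
            tokens.append(" ".join(run + [word]))
            run = []
        else:
            tokens.append(word)
    # Prefixes left without a following word stay as separate tokens.
    tokens.extend(run)
    return tokens


if __name__ == "__main__":
    for example in ("lekau le a tshega", "ga ba ithute", "o ipona"):
        print(group_verbal_tokens(example))
```

On the abstract's own examples, the orthographic words le, a and tshega end up in one verbal token, while the conjunctively written reflexive in ipona needs no merging and would have to be handled by the morphological analyser instead.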
Similar resources
Finite state tokenisation of an orthographical disjunctive agglutinative language: The verbal segment of Northern Sotho
Tokenisation is an important first pre-processing step required to adequately test finite-state morphological analysers. In agglutinative languages each morpheme is concatenatively added on to form a complete morphological structure. Disjunctive agglutinative languages like Northern Sotho write these morphemes, for certain morphological categories only, as separate words separated by spaces or ...
Morphosyntactic discrepancies in representing the adjective equivalent in African WordNet with reference to Northern Sotho
This paper aims to highlight morphosyntactic discrepancies encountered in representing the adjective equivalent in African WordNet, with reference to Northern Sotho. Northern Sotho is an agglutinating language with rich and productive morphology. The language also features a disjunctive orthographic system. The orthography determines the attachment selection of morphemes. The immediate issue, i...
Conventional Orthography for Dialectal Arabic
Dialectal Arabic (DA) refers to the day-to-day vernaculars spoken in the Arab world. DA lives side-by-side with the official language, Modern Standard Arabic (MSA). DA differs from MSA on all levels of linguistic representation, from phonology and morphology to lexicon and syntax. Unlike MSA, DA has no standard orthography since there are no Arabic dialect academies, nor is there a large edited...
Nonlinear disjunctive kriging for the estimating and modeling of a vein copper deposit
Estimation of mineral resources and reserves with low values of error is essential in mineral exploration. The aim of this study is to estimate and model a vein-type deposit using the disjunctive kriging method. Disjunctive Kriging (DK), as an appropriate nonlinear estimation method, has been used for estimation of Cu values. For estimation of Cu values and modelling of the distributio...
The Effect of L1 Persian on the Acquisition of English L2 Orthographic System on the Shared Grounds
This paper elaborates on the shared orthographic aspects of Persian and English to study the effects of L1 Persian on learning English as a foreign language. While there are some examples of letter and sound mismatches in the orthographic systems of both languages, those of English are more complex than those of Persian. In order to see the effect of the mismatch between orthography and transcription, 40 Persia...